Abstract

We start by postulating a Poisson regression model with a random
error term,

$$P_Y(y \mid x, \xi) = \frac{[\lambda(x)\xi]^{y}}{y!}\, e^{-\lambda(x)\xi}, \qquad y = 0, 1, 2, \ldots,$$

where $x$ is assumed to be a nonstochastic variable and $\xi$ is a random
variable such that $2\xi$ has a $\chi^2$ distribution with $2r$ degrees of
freedom. Then the marginal distribution of $Y$ is the negative binomial
distribution with probability function

$$P_Y(y) = \frac{\Gamma(r+y)}{\Gamma(y+1)\,\Gamma(r)} \left[\frac{1}{1+\lambda(x)}\right]^{r} \left[\frac{\lambda(x)}{1+\lambda(x)}\right]^{y}$$

for $y = 0, 1, 2, \ldots$ and $r > 0$. We define a binary response variable $Z$
such that $Z = 1$ iff $Y \ge 1$ and $Z = 0$ iff $Y = 0$. If
$\lambda(x) = \exp(\alpha + \beta x)$, a generalized logit model

$$\log\left[\frac{1 - P_0(x)^{1/r}}{P_0(x)^{1/r}}\right] = \alpha + \beta x$$

follows, where $P_0(x) = P(Z = 0 \mid x)$. To assess the practical usefulness
of the generalization, we fitted the model to Ashford and Sowden's data.
Compared with the ordinary logit model, the $\chi^2$ goodness-of-fit
statistic improves significantly, most notably in the tail areas of
$P_0(x)$. Further versions of qualitative response models are also
discussed in connection with the Poisson process.


1.   Introduction 

The purpose of the present paper is to propose a possible
generalization of the logit transformation. The logit transformation of a
binomial probability has been widely used to analyze qualitative data
in socio-economic investigations as well as in biometric research. The
generalized logit model developed here involves only one more parameter
than the conventional logit model, so the simplicity of the latter is
essentially preserved. We base our derivation on the following
assumption: a binary response may be observed as an indicator for an
underlying (possibly unobservable) nonhomogeneous compound Poisson
process; i.e., it indicates whether or not the number of events that have
occurred in the process reaches a fixed unknown threshold. The Poisson
rate parameter is assumed to depend on some exogenous factors as well as
on a multiplicative random component. Since the Poisson process is
derived on the basis of a few weak assumptions, it would be fair to claim
that our approach gives another natural interpretation of the logit model.

The genesis of the probability integral model, including the logit
and probit models, is usually described by postulating a hypothetical
random variable called the tolerance, the variation of which causes the
randomness in the binary response. If the tolerance has a logistic
distribution, the logit model follows. (See, for instance, Cox [1970].)
In Section 2 we incidentally propose a probability integral model based
on a chi-square distribution. That is, if a Poisson process with a
nonhomogeneous rate parameter underlies a binary response, what we call
the chisquit model immediately follows.


To see the practical relevance of our generalization, we fit the
model to some empirical data. The results given in Section 4 show that
our generalization improves the statistical fit of the model quite
substantially while causing no essential difficulty in computation.

2.   Poisson  Regression  and  Chisquit  Model 

To begin with, let us consider a random phenomenon in which events
occur in a Poisson process with nonhomogeneous rate parameter $\lambda$.
For the time being, we assume that the variation of $\lambda$ is fully
explained by some independent variables $x$. Then the number $Y$ of events
occurring in an interval of fixed length $t$, for fixed $x$, is a Poisson
random variable with probability function

$$P_Y(y \mid x) = \frac{[\lambda(x)t]^{y}}{y!}\, e^{-\lambda(x)t} \tag{2.1}$$

for $y = 0, 1, 2, \ldots$, where $\lambda(x)$ is a function of $x$ whose range
is limited to the positive half of the real line. If $\lambda(x)$ is
specified up to its functional form and a random sample of $Y$ is observed
with the corresponding values of $x$, then we can make inferences about the
unknown parameters in $\lambda(x)$. This is called Poisson regression
analysis, special cases of which have been investigated by Gart [1964]
and Jorgenson [1961].¹

If we let $T$ be the continuous amount of time (or area, distance,
etc.) required to observe the $k$-th event in the Poisson process with
rate parameter $\lambda(x)$, starting from an arbitrary point in the
process, the nonnegative random variable $T$ has a Gamma distribution with
density function $f(t) = \lambda(\lambda t)^{k-1} e^{-\lambda t}/\Gamma(k)$
for $t > 0$ and $f(t) = 0$ elsewhere. At least $k$ events occur by time $t$
exactly when the $k$-th event arrives no later than $t$, and $2\lambda T$
has a $\chi^2$ distribution with $2k$ degrees of freedom. Hence we obtain
the equality

$$P(Y \ge k \mid x, t) = P(T \le t) = F_{\chi^2_{2k}}[2t\lambda(x)], \tag{2.2}$$

where $F_{\chi^2_{2k}}$ is the cumulative distribution function (cdf) of
the $\chi^2$-distribution with $2k$ degrees of freedom. (See, for
instance, Johnson and Kotz [1969], p. 98.)
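The first equality in (2.2) can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names and the values of `k`, `rate` and `t` are illustrative assumptions, not quantities from the paper. It compares the Poisson tail probability with a Simpson-rule integral of the Gamma waiting-time density for the k-th event.

```python
import math

def poisson_tail(k, mean):
    """P(Y >= k) for Y ~ Poisson(mean), as one minus the lower partial sum."""
    lower = sum(math.exp(-mean) * mean**y / math.factorial(y) for y in range(k))
    return 1.0 - lower

def gamma_waiting_cdf(k, rate, t, steps=20000):
    """P(T <= t), T being the waiting time to the k-th event.

    Integrates rate*(rate*s)**(k-1)*exp(-rate*s)/Gamma(k) over [0, t]
    with the composite Simpson rule (steps must be even).
    """
    def density(s):
        return rate * (rate * s) ** (k - 1) * math.exp(-rate * s) / math.gamma(k)
    h = t / steps
    total = density(0.0) + density(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3.0

# Illustrative values only (assumed, not taken from the paper).
k, rate, t = 3, 1.7, 0.5
lhs = poisson_tail(k, rate * t)       # P(Y >= k | x, t)
rhs = gamma_waiting_cdf(k, rate, t)   # P(T <= t)
print(lhs, rhs)
```

The second equality in (2.2) is then only a change of variable, since twice the rate times the waiting time is chi-square distributed with 2k degrees of freedom.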

Now let us define the following binary response model on the
Poisson process: a qualitative change (catastrophe) that concerns us
occurs if and only if $Y \ge k$; namely, a binary random variable $Z$ equals
one if and only if $Y \ge k$ and zero otherwise. This is a version of the
multi-hit model used in biological applications. As a practical example,
this relates to the case where $Y$ stands for a random accumulation of
causes of a certain catastrophic change: if the number of an individual's
accumulated causes reaches the threshold $k$, then a catastrophe occurs to
him; otherwise, he remains in the same state. The degree of the change
would be somehow related to the amount by which $Y$ exceeds $k$. However,
what we are concerned with and actually observe is a binary response:
whether or not the catastrophe occurred to each individual. The rate
parameter $\lambda$, which is intrinsic to each individual, is regarded as
indicating his proneness to the catastrophe, which is supposed to be
determined by his characteristics as well as by some exogenous factors.

Another example, presented below, concerns limited observability.
It very often happens that, because of limited observability or for some
other reason, we witness only an all-or-none response: whether or not at
least one event has occurred to each individual in an interval of fixed
length. To put it differently, the number of events which might have
occurred is unobservable or outside our concern. A typical example is
survey research on possession of a certain durable good, say a car. The
survey is often concerned only with a binary response: whether or not
each individual possesses a car. It conceals the number of cars he
possesses as well as the quality of his car.

If we adjust the scale of measuring length so that $t = 1/2$, then
we obtain a binary response model²

$$P(Z = 1 \mid x) = F_{\chi^2_{2k}}[\lambda(x)]. \tag{2.3}$$

The unknown parameter $k$ may or may not have a definite physical meaning.
The function $\lambda(x)$ is usually specified as a linear, exponential,
or multiplicative function. If the value of $k$ is not determined
theoretically, then it should be regarded as a parameter that must be
estimated from data simultaneously with the parameters in $\lambda(x)$.

The underlying structure of the probit and logit models is often
described by postulating the existence of an unobservable (hypothetical)
random variable called the tolerance: namely, $Z = 1$ if and only if the
tolerance, say $U$, falls below the threshold, $c(x)$ say, determined by
an individual's characteristics $x$.³ If the cdf of the tolerance is
$F_U$, then

$$P(Z = 1 \mid x) = F_U[c(x)]. \tag{2.4}$$

If $U$ has either a normal or a logistic distribution, the probit or
logit model follows, respectively. Our model, straightforwardly derived
from a nonhomogeneous Poisson process, may be regarded as an alternative
specification of the tolerance model, and it could appropriately be
called the chisquit model.

Since $U$ is intrinsically a hypothetical variable, there is no
reason at all to confine its distribution to a class of symmetric
distributions. In some practical applications it might be more adequate
to assume that the tolerance is distributed with some skewness and that
its range is bounded below. Since the $\chi^2$-distribution is
asymptotically normal as its degrees of freedom become large, it may be
fair to say that the probit model is obtained as a limiting form of the
chisquit model.

As for estimation, no particular difficulty arises if the value of
$k$ is specified a priori. We can employ methods essentially similar to
those used to estimate the probit and logit models. If $k$ is unspecified
a priori, it could be estimated simultaneously with the parameters in
$\lambda(x)$.

The distribution of $\log(\chi^2)$ approaches the normal distribution
more rapidly than that of $\chi^2$ itself. (See Johnson and Kotz [1970],
p. 181.) Therefore, if the varying rate parameter is reasonably specified
as an exponential function such as $\exp(\alpha + \beta x)$, then

$$P(Z = 1 \mid x) = P(\log \chi^2_{2k} \le \alpha + \beta x), \tag{2.5}$$

the right-hand side of which could be very closely approximated by the
cdf of the normal distribution with mean $\log(2k)$ and variance $1/k$,
unless $k$ is extremely small. Moreover, if $\lambda(x)$ is a
multiplicative function such as $\alpha x^{\beta}$, we have a log-linear
function of $x$ instead of a linear one on the right-hand side of (2.5).
The above consideration leads us to the following.

Suppose that the linear or log-linear probit model fits given data
very well. Then this suggests that the varying rate parameter in the
assumed Poisson process might appropriately be specified as an
exponential or multiplicative function of $x$. Of course, it is fair to
say that there is no way of discriminating a model with linear
$\lambda(x)$ and $k$ large enough to permit the normal approximation from
an alternative model with exponential $\lambda(x)$ and small $k$. In
either of these two cases the linear probit model will fit the data very
well.
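The quality of the normal approximation in (2.5) can be explored numerically. The sketch below uses only the Python standard library and is restricted to integer k, for which the chi-square cdf with 2k degrees of freedom has a closed form; the helper names, the grid, and the two values of k are illustrative assumptions. It reports the largest gap between the exact chisquit probability and the normal cdf with mean log(2k) and variance 1/k, over a grid of values of the linear predictor u = α + βx.

```python
import math

def chi2_cdf_even(v, k):
    """CDF at v of the chi-square distribution with 2k d.f. (k a positive integer)."""
    half = v / 2.0
    return 1.0 - math.exp(-half) * sum(half**j / math.factorial(j) for j in range(k))

def normal_cdf(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def max_gap(k, points=401):
    """Largest |exact - approximation| over u within four s.d. of log(2k)."""
    centre, sd = math.log(2 * k), math.sqrt(1.0 / k)
    worst = 0.0
    for i in range(points):
        u = centre + sd * (-4.0 + 8.0 * i / (points - 1))
        exact = chi2_cdf_even(math.exp(u), k)    # P(log chi2_{2k} <= u)
        approx = normal_cdf((u - centre) / sd)   # N(log 2k, 1/k) approximation
        worst = max(worst, abs(exact - approx))
    return worst

gap_k10, gap_k50 = max_gap(10), max_gap(50)
print(gap_k10, gap_k50)
```

Comparing the two printed gaps gives a rough sense of how quickly the approximation tightens as k grows.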

3.   Generalized  Logit  Transformation 

One of the apparent shortcomings of the chisquit model is the
following: we assume that the rate parameter $\lambda$ for each individual
is completely determined by a finite number of explanatory variables $x$.
This is obviously unrealistic and necessitates modification of the
model. Also, in practice, we need to keep the number of variables as
small as possible. To cope with this we permit the rate parameter to be a
random function of the characteristics set $x$, i.e.

$$\lambda = \lambda(x)\,\xi, \tag{3.1}$$

where $2\xi$ is a random variable having a $\chi^2$ distribution with $2r$
degrees of freedom. It should be noted that $2r$ need not be an integer.
Also, as the model is multiplicative, no loss of generality is caused by
assuming that the distribution of $2\xi$ is $\chi^2$ instead of a Gamma
distribution.

Given $\xi$, $Y$ has a conditional Poisson distribution. It is
straightforward to show that $Y$ is unconditionally distributed as
negative binomial with probability function⁵

$$P_Y(y) = \frac{\Gamma(r+y)}{\Gamma(y+1)\,\Gamma(r)} \left[\frac{1}{1+\lambda(x)}\right]^{r} \left[\frac{\lambda(x)}{1+\lambda(x)}\right]^{y} \tag{3.2}$$

for $y = 0, 1, 2, \ldots$ and $r > 0$.
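The compounding step behind (3.2) can be verified numerically. The following sketch, using only the Python standard library, integrates the conditional Poisson probability against the Gamma density of ξ with shape r and unit scale (equivalently, 2ξ chi-square with 2r degrees of freedom) and compares the result with the negative binomial formula. The function names, the integration limits, and the values r = 2.5 and λ = 0.8 are illustrative assumptions; the integrator is written for r > 1 only, so that the integrand vanishes at ξ = 0.

```python
import math

def neg_binomial_pmf(y, r, lam):
    """Negative binomial probability of the form (3.2), with lam standing for lambda(x)."""
    log_coef = math.lgamma(r + y) - math.lgamma(y + 1) - math.lgamma(r)
    log_p = -r * math.log1p(lam) + y * (math.log(lam) - math.log1p(lam))
    return math.exp(log_coef + log_p)

def mixed_poisson_pmf(y, r, lam, upper=60.0, steps=60000):
    """Simpson integral of Poisson(y; lam*xi) times the Gamma(r, 1) density of xi."""
    def integrand(xi):
        if xi == 0.0:
            return 0.0   # correct for r > 1, the only case this sketch handles
        log_poisson = y * math.log(lam * xi) - lam * xi - math.lgamma(y + 1)
        log_gamma = (r - 1.0) * math.log(xi) - xi - math.lgamma(r)
        return math.exp(log_poisson + log_gamma)
    h = upper / steps
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0

# Illustrative parameter values (assumed, not from the paper).
r, lam = 2.5, 0.8
pairs = [(neg_binomial_pmf(y, r, lam), mixed_poisson_pmf(y, r, lam)) for y in range(6)]
for closed_form, integral in pairs:
    print(closed_form, integral)
```

Each printed pair should agree to within the accuracy of the numerical integration.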
Now let $k$ be a threshold: i.e., we have a binary response $Z$ such
that $Z = 1$ if $Y \ge k$ and $Z = 0$ otherwise. We note that

$$P(Y \ge k) = \sum_{y=k}^{\infty} \frac{\Gamma(r+y)}{\Gamma(y+1)\,\Gamma(r)} \left[\frac{1}{1+\lambda(x)}\right]^{r} \left[\frac{\lambda(x)}{1+\lambda(x)}\right]^{y} = I_{\psi(x)}(k, r), \tag{3.3}$$

where

$$\psi(x) = \frac{\lambda(x)}{1+\lambda(x)} \tag{3.4}$$

and $I$ is the incomplete Beta function. A generalized logit
transformation

$$\log\left\{\frac{1 - P_0(x)^{1/r}}{P_0(x)^{1/r}}\right\} = \log \lambda(x), \tag{3.5}$$

where $P_0(x) = P(Z = 0 \mid x)$ and $r$ is a positive constant, may be
derived from either of the following two models. First, let us suppose
that $k = 1$, i.e., the binary response indicates whether or not at least
one event occurs. Then we have

$$P_0(x) = P(Z = 0 \mid x) = \frac{1}{[1+\lambda(x)]^{r}}. \tag{3.6}$$

It is straightforward to verify that this implies (3.5): taking the
$1/r$-th power gives $P_0(x)^{1/r} = 1/[1+\lambda(x)]$, so that
$[1 - P_0(x)^{1/r}]/P_0(x)^{1/r} = \lambda(x)$. Second, let $\xi$ be
exponentially distributed, i.e., $r = 1$. Then model (3.6) again follows
with $r$ replaced by $k$. Which model is more appropriate is a question
that should be answered case by case on a priori grounds; that is, the
two models may not be discriminated objectively by data. The
conventional logit transformation, (3.5) with $r = 1$, corresponds to the
all-or-none binary response (i.e., $k = 1$) defined on a compound
nonhomogeneous Poisson process with exponential error distribution
(i.e., $r = 1$).
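The relation between (3.5) and (3.6) is easy to confirm numerically. The sketch below (standard-library Python; function names and test values are illustrative assumptions) applies the generalized logit transformation to P0 = [1 + λ]^(-r) and checks that it returns log λ, and that for r = 1 it coincides with the ordinary logit of the response probability 1 - P0.

```python
import math

def p0(lam, r):
    """P(Z = 0 | x) as in (3.6): probability of no event, with lam = lambda(x)."""
    return (1.0 + lam) ** (-r)

def generalized_logit(p, r):
    """Left-hand side of (3.5): log{(1 - p**(1/r)) / p**(1/r)} for p = P(Z = 0 | x)."""
    root = p ** (1.0 / r)
    return math.log((1.0 - root) / root)

def ordinary_logit(p):
    """Conventional logit of the response probability 1 - p."""
    return math.log((1.0 - p) / p)

# Illustrative values of lambda(x) and r (assumed).
for lam in (0.05, 1.0, 7.5):
    for r in (0.3, 1.0, 2.5):
        print(lam, r, generalized_logit(p0(lam, r), r), math.log(lam))
```

In each printed row the last two numbers should coincide up to rounding, which is the content of (3.5).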
Moreover, if we specify $\lambda(x) = e^{\alpha + \beta x}$ and $r = 1$,
then the familiar linear logit model follows. If we specify a
multiplicative model $\lambda(x) = \alpha x^{\beta}$, the log-linear logit
model follows. Since $r$ is an unknown parameter appearing in the error
distribution, it should be estimated simultaneously with the parameters
in $\lambda(x)$ from sample observations.

When we have grouped data, the simplest way of estimating the
model would be the so-called Berkson's minimum chi-square method.
The asymptotic variance of the empirical generalized logit
transformation, $\log\{[1 - \hat{p}_0^{1/r}]/\hat{p}_0^{1/r}\}$, is given by

$$\frac{1 - p_0}{n\, r^2\, (1 - p_0^{1/r})^2\, p_0}, \tag{3.7}$$

where $p_0 = P(Z = 0 \mid x)$ and $\hat{p}_0$ is the estimate of $p_0$ based
on a sample of size $n$. Replacing the unknown $p_0$ by its sample estimate
$\hat{p}_0$, we can apply generalized least squares to obtain estimates of
the parameters in $\lambda(x)$ for a given value of $r$. The optimal value
of $r$ might be found by minimizing, for example, the $\chi^2$
goodness-of-fit test statistic with respect to $r$.
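The variance (3.7) is what the delta method gives when the generalized logit transformation is applied to a binomial proportion with variance p0(1 - p0)/n. The sketch below (standard-library Python; function names, the step size `eps`, and the test values are illustrative assumptions) compares the closed form with a finite-difference version of the delta method.

```python
import math

def transform(p, r):
    """Generalized logit transformation of p = P(Z = 0 | x)."""
    root = p ** (1.0 / r)
    return math.log((1.0 - root) / root)

def variance_closed_form(p, r, n):
    """Asymptotic variance as written in (3.7)."""
    return (1.0 - p) / (n * r**2 * (1.0 - p ** (1.0 / r)) ** 2 * p)

def variance_delta_method(p, r, n, eps=1e-6):
    """[d transform / dp]**2 * p(1 - p)/n, with the derivative by central difference."""
    slope = (transform(p + eps, r) - transform(p - eps, r)) / (2.0 * eps)
    return slope**2 * p * (1.0 - p) / n

# Illustrative values (assumed): p0 = 0.7, r = 0.36, n = 2000.
closed = variance_closed_form(0.7, 0.36, 2000)
numeric = variance_delta_method(0.7, 0.36, 2000)
print(closed, numeric)
```

For r = 1 the closed form reduces to 1/[n p0 (1 - p0)], the usual asymptotic variance of an empirical logit.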


4.   Numerical  Results 

To examine the practical relevance of our generalization we fitted
the generalized logit model to Ashford and Sowden's [1970] data. Their
data, presented in tables after aggregation, consist of the numbers of
coal miners in nine 5-year-wide age groups reporting either, neither, or
both of the respiratory symptoms breathlessness and wheeze. They
developed the bivariate probit model to analyze these data, which were
later reanalyzed by Grizzle [1971] and Mantel and Brown [1973].
Strictly, the Ashford-Sowden data should be analyzed by a bivariate
model, but for simplicity we neglect the multivariate as well as the
multinomial aspect of the data and treat them as if they consisted of
two separate sets of binomial data, one for each symptom.

The optimal value of $r$ was determined so that the $\chi^2$
goodness-of-fit test statistic

$$\chi^2 = \sum_{i=1}^{9} \sum_{j=1}^{2} \frac{(y_{ij} - \hat{y}_{ij})^2}{\hat{y}_{ij}} \tag{4.1}$$

is minimized separately for each symptom, where $y_{ij}$ and
$\hat{y}_{ij}$ are the observed and interpolated frequencies in the cell
for the $i$-th age group and for either having or not having the symptom
($j = 1$ corresponds to "yes" and $j = 2$ corresponds to "no").

Following Grizzle [1971] and Mantel and Brown [1973], the normalized
median age of each group, $x = (\text{median age} - 17)/5$, is taken as the
explanatory variable. For simplicity, we assume a linear function of $x$
for the right-hand side of (3.5) and employ Berkson's minimum chi-square
method to estimate the coefficients for each given value of $r$.
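The estimation procedure just described can be sketched in a few lines. The code below is an illustration under stated assumptions, not the author's program: it takes the observed wheeze counts from Table 2, forms the empirical generalized logit for a fixed r, weights each age group by the reciprocal of the variance (3.7), and solves the resulting weighted least squares problem for a straight line in x (with independent groups, generalized least squares reduces to this weighted form). The function names `fit` and `chi_square` are hypothetical.

```python
import math

# Observed wheeze counts by age group (x = 1, ..., 9), copied from Table 2.
YES = [104, 128, 231, 378, 442, 593, 649, 631, 504]
NO = [1848, 1663, 1882, 2405, 1832, 1800, 1441, 1119, 628]
X = list(range(1, 10))

def fit(r):
    """Weighted least squares of the empirical transform (3.5) on x, for fixed r."""
    rows = []
    for x, yes, no in zip(X, YES, NO):
        n = yes + no
        p = no / n                                  # empirical P(Z = 0 | x)
        root = p ** (1.0 / r)
        t = math.log((1.0 - root) / root)           # empirical generalized logit
        var = (1.0 - p) / (n * r**2 * (1.0 - root) ** 2 * p)   # formula (3.7)
        rows.append((x, t, 1.0 / var))
    sw = sum(w for _, _, w in rows)
    xbar = sum(w * x for x, _, w in rows) / sw
    tbar = sum(w * t for _, t, w in rows) / sw
    beta = (sum(w * (x - xbar) * (t - tbar) for x, t, w in rows)
            / sum(w * (x - xbar) ** 2 for x, _, w in rows))
    return tbar - beta * xbar, beta                 # (constant term, slope)

def chi_square(r, alpha, beta):
    """Pearson statistic over the 18 cells (yes/no for each of the 9 age groups)."""
    total = 0.0
    for x, yes, no in zip(X, YES, NO):
        n = yes + no
        p = (1.0 + math.exp(alpha + beta * x)) ** (-r)   # fitted P(Z = 0 | x)
        total += (no - n * p) ** 2 / (n * p)
        total += (yes - n * (1.0 - p)) ** 2 / (n * (1.0 - p))
    return total

for r in (1.0, 0.4, 0.36, 0.3):
    a, b = fit(r)
    print(r, round(a, 3), round(b, 3), round(chi_square(r, a, b), 2))
```

For r = 1 the weights reduce to n p̂(1 - p̂), the usual minimum logit chi-square weights, and the output can be compared with the corresponding row of Table 3. Scanning r over a grid and keeping the value with the smallest statistic mirrors the search for the optimal r.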

The estimated generalized logit models are

$$P(Z_1 = 0 \mid x) = \frac{1}{[1 + \exp(-4.012 + 0.632x)]^{0.293}} \tag{4.2}$$

with $\chi^2 = 4.998$ for breathlessness, and

$$P(Z_2 = 0 \mid x) = \frac{1}{[1 + \exp(-2.226 + 0.400x)]^{0.360}} \tag{4.3}$$

with $\chi^2 = 3.653$ for wheeze, as compared to the ordinary logit models

$$P(Z_1 = 0 \mid x) = \frac{1}{1 + \exp(-4.804 + 0.510x)} \tag{4.4}$$

with $\chi^2 = 16.554$ for breathlessness, and

$$P(Z_2 = 0 \mid x) = \frac{1}{1 + \exp(-3.116 + 0.326x)} \tag{4.5}$$

with $\chi^2 = 8.027$ for wheeze. The degrees of freedom are 14 and 15,
respectively, for the generalized and ordinary logit models. The
interpolated values are tabulated in Tables 1 and 2 together with the
observed values. It may be fair to say that the improvement is
significant on the whole. In particular, it is remarkable for the tail
areas.
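The interpolated frequencies can be recomputed directly from the fitted equations. The sketch below (standard-library Python; the function and variable names are illustrative assumptions) evaluates the breathlessness models (4.2) and (4.4) for the youngest and oldest age groups, using the group sizes implied by the observed counts in Table 1. Because the published coefficients are rounded, small discrepancies from the tabulated values are to be expected.

```python
import math

def p_no_symptom(x, a, b, r=1.0):
    """P(Z = 0 | x) = [1 + exp(a + b*x)]**(-r); r = 1 gives the ordinary logit."""
    return (1.0 + math.exp(a + b * x)) ** (-r)

# Group sizes from the observed "yes" + "no" counts in Table 1.
n_youngest = 16 + 1936    # ages 20-24, x = 1
n_oldest = 478 + 658      # ages 60-64, x = 9

# Expected numbers reporting breathlessness under each fitted model.
gen_youngest = n_youngest * (1.0 - p_no_symptom(1, -4.012, 0.632, 0.293))   # (4.2)
logit_youngest = n_youngest * (1.0 - p_no_symptom(1, -4.804, 0.510))        # (4.4)
gen_oldest = n_oldest * (1.0 - p_no_symptom(9, -4.012, 0.632, 0.293))       # (4.2)
logit_oldest = n_oldest * (1.0 - p_no_symptom(9, -4.804, 0.510))            # (4.4)

# Compare with the "Yes" columns of Table 1 (observed: 16 and 478).
print(round(gen_youngest, 1), round(logit_youngest, 1))
print(round(gen_oldest, 1), round(logit_oldest, 1))
```

Setting these numbers against the observed counts of 16 and 478 shows concretely where the two models differ in the tails.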

To show the sensitivity of the model to changes in $r$, we present
in Table 3 the estimates $\hat{\alpha}$ and $\hat{\beta}$ of the constant
term and the coefficient of $x$, with the associated value of the
$\chi^2$ statistic, for different values of $r$.

To give contrast to the Ashford-Sowden data, the model was also
fitted to Morimune's [1976] data relating private ownership of a
house to income. It turned out that the log-linear model is far more
appropriate than the linear model. The estimated generalized logit
model is

$$P(Z = 0 \mid x) = \frac{1}{[1 + \exp(-9.393 + 0.911 \log x)]^{\hat{r}}} \tag{4.6}$$

with $\chi^2 = 8.616$ (the numerical value of the exponent $\hat{r}$ is not
legible in this copy), as compared to the ordinary logit model

$$P(Z = 0 \mid x) = \frac{1}{1 + \exp(-11.102 + 1.298 \log x)} \tag{4.7}$$

with $\chi^2 = 9.568$. Also, the estimated chisquit model is

$$P(Z = 0 \mid x) = F_{\chi^2_{7.158}}(-1.573 + 0.403 \log x) \tag{4.8}$$

with $\chi^2 = 8.530$.
In this example the improvement of fit is not striking. If we
take into account the decrease in the degrees of freedom, almost no
significant gain is obtained by generalizing the logit model through the
transformation parameter $r$. However, Table 4 shows that in the tail
areas of the distribution the goodness of fit was improved, albeit
slightly, by generalizing the model. It is also interesting to note that
the generalized logit model and the chisquit model gave almost the same
interpolated numbers.

It is straightforward to extend the model to the case of a
multinomial ordered response. Possible further developments of the
work in this note include the extension to multivariate cases by
postulating a multivariate Poisson process.


Table 1.   Ashford-Sowden Data on Breathlessness

                        Yes                          No
Age      x      Obs.   r=.293     r=1       Obs.    r=.293      r=1
20-24    1       16     19.0     26.3       1936    1933.0    1925.7
25-29    2       32     32.3     39.8       1759    1758.7    1751.2
30-34    3       73     69.3     77.1       2040    2043.7    2035.9
35-39    4      169    161.8    165.2       2614    2621.2    2617.8
40-44    5      223    225.0    216.2       2051    2049.0    2057.8
45-49    6      357    379.8    356.4       2036    2013.2    2036.6
50-54    7      521    494.4    471.7       1569    1595.6    1618.3
55-59    8      558    570.6    571.9       1192    1179.4    1178.1
60-64    9      478    475.2    507.8        658     660.8     628.2

The columns headed r=.293 and r=1 contain the values interpolated by
models (4.2) and (4.4), respectively.


Table 2.   Ashford-Sowden Data on Wheeze

                        Yes                          No
Age      x      Obs.   r=.360     r=1       Obs.    r=.360      r=1
20-24    1      104    102.0    112.9       1848    1850.0    1839.1
25-29    2      128    133.4    140.4       1663    1657.6    1650.6
30-34    3      231    220.3    222.8       1882    1892.7    1890.2
35-39    4      378    397.1    390.7       2405    2385.9    2392.3
40-44    5      442    432.2    419.6       1832    1841.8    1854.4
45-49    6      593    587.5    571.1       1800    1805.5    1821.9
50-54    7      649    641.8    632.8       1441    1448.2    1457.2
55-59    8      631    651.0    657.4       1119    1099.0    1092.6
60-64    9      504    495.0    514.6        628     637.0     617.4

The columns headed r=.360 and r=1 contain the values interpolated by
models (4.3) and (4.5), respectively.


Table 3.   Estimates and $\chi^2$ Statistic for Different
           Values of r:   Ashford-Sowden Wheeze Data

   r     $\hat{\alpha}$   $\hat{\beta}$   $\chi^2$
  10       -5.345          .289           14.74
   8       -5.124          .290           14.50
   6       -4.840          .292           14.11
   4       -4.442          .295           13.35
   2       -3.769          .306           11.29
   1       -3.116          .326            8.03
  .8       -2.913          .336            6.79
  .6       -2.657          .354            5.23
  .4       -2.313          .388            3.75
  .3       -2.083          .423            3.97
  .2       -1.785          .493            7.86
  .1       -1.363          .709           30.33
Footnotes


1.  Gart [1964] analyzed the case when $\lambda(x)$ is a linear function of
a single explanatory variable without a constant term. Jorgenson
[1961] developed maximum likelihood estimation for the case
when $\lambda(x)$ is a linear function of several variables. As far as
I know, there have been quite a few applications of Poisson
regression analysis to real problems.

2.  The rate parameter is not independent of the choice of the scale
of measuring length. However, if we assume an exponential or
multiplicative function for $\lambda(x)$, the choice has no effect on
the relevant coefficients of the variables $x$; i.e., only the constant
term is affected by the choice of the scale.

3.  For more details of the tolerance model, see Cox [1970].
In the context of econometric analysis, the underlying structure
of the model is often described by postulating the existence of
a random utility instead of the tolerance.

4.  The chisquit model defined by (2.3) may be regarded as a reduced
form of the binary response model defined on a Poisson process with
varying parameter $\lambda(x)$. In this case the parameter $k$ must be an
integer. It is possible, however, to view (2.3) as a version of
the tolerance model. Then $k$ need not be an integer.

5.  The derivation of the negative binomial distribution by
compounding the Poisson and Gamma distributions is found in most
textbooks. See, for instance, Johnson and Kotz [1969]. This
distribution is also derived by assuming different sorts of
underlying chance mechanisms. A comprehensive review is given by
Boswell and Patil [1970]. Under certain circumstances, postulating
another underlying chance mechanism instead of the compound
Poisson regression may produce a more reasonable physical
interpretation.